-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
migrate xpack-metricbeat #38081
migrate xpack-metricbeat #38081
Conversation
This pull request doesn't have a |
|
||
cleanup() { | ||
echo "---Terraform Cleanup" | ||
.ci/scripts/terraform-cleanup.sh "${MODULE_DIR}" #TODO: move all docker-compose files from the .ci to .buildkite folder before switching to BK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, there is a missing context what's the reason for running this terraform clean up function. A comment can help in the future.
In addition, should this particular script run only if the stage cloud
ran for the x-pack/metricbeat
? Or will it work regardless of the stages in all the beats in case they call the cleanup function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both of these functions will be run only when it's needed, for example: https://github.com/sharbuz/beats/blob/25547b4ef8c75e9c356573b39e0bc25c0bf5e022/.buildkite/scripts/cloud_tests.sh#L7
It will run when the script finishes with EXIT code, which means when it finishes successfully/unsuccessfully
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do understand the trap function.
My main concern is regarding the cleanup function itself. If the pipeline finishes with exit 0
, then cleanup
will run, but if it fails, will the failure be reported, too? I'm not sure if that's what we want. Post-build steps for tearing down resources should not fail the build but make it unstable or notify the cloud resources manager with a message, IMO.
My team enabled a cloud-reaper mechanism that runs async to delete any leftovers in any of the cloud providers we use. That's how we can avoid having stalled resources and failing builds if the cleanup failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've got what you mean, In this case I would suggest using the additional "cleanup" step with Slack notification or pipeline with email notification. @v1v WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To illustrate a bit more, the current implementation in Jenkins does not fail when tearing down the resources with terraform/docker:
Lines 945 to 950 in 3f46222
dirs?.each { folder -> // If it failed then cleanup without failing the build sh(label: 'Terraform Cleanup', script: ".ci/scripts/terraform-cleanup.sh ${folder}", returnStatus: true) } // Cleanup the docker services sh(label: 'Docker Compose Cleanup', script: ".ci/scripts/docker-services-cleanup.sh", returnStatus: true)
It uses returnStatus: true
that means do nothing, see
returnStatus : boolean (optional)
Normally, a script which exits with a nonzero status code will cause the step to fail with an exception. If this option is checked, the return value of the step will instead be the status code. You may then compare it to zero, for example.
See https://www.jenkins.io/doc/pipeline/steps/workflow-durable-task-step/#sh-shell-script
You can get the same:
either you can || true
or use your additional cleanup with soft_fail in Buildkite.
WDYT?
Sounds good to me. But let me ask @dliappis about his preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done: f432822
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few open discussions without an explicit answer or action, hence I'm blocking this.
.buildkite/scripts/install_tools.sh
Outdated
@@ -43,6 +43,10 @@ with_dependencies | |||
config_git | |||
mage dumpVariables | |||
|
|||
if [[ "$BUILDKITE_PIPELINE_SLUG" == "beats-xpack-metricbeat" && "${BUILDKITE_STEP_KEY}" == "extended-cloud-test" ]]; then | |||
startCloudTestEnv "${MODULE_DIR}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for using this approach is to collect all needed dependencies/tools. When it's done we'll think about using our own BK images with one/two/three commands in the command step. For example:
command: "cd $BEATS_PROJECT_NAME && mage package" |
.buildkite/scripts/common.sh
Outdated
if [[ "$BUILDKITE_PIPELINE_SLUG" == "beats-xpack-metricbeat" ]]; then | ||
withModule "${MODULE_DIR}" | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
answered here
.buildkite/scripts/setenv.sh
Outdated
exportVars | ||
export RACE_DETECTOR="true" | ||
export TEST_COVERAGE="true" | ||
export DOCKER_PULL="0" | ||
export TEST_TAGS="oracle" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed/changed
.buildkite/scripts/install_tools.sh
Outdated
@@ -43,6 +43,10 @@ with_dependencies | |||
config_git | |||
mage dumpVariables | |||
|
|||
if [[ "$BUILDKITE_PIPELINE_SLUG" == "beats-xpack-metricbeat" && "${BUILDKITE_STEP_KEY}" == "extended-cloud-test" ]]; then | |||
startCloudTestEnv "${MODULE_DIR}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tested your suggestions and I faced some issues and I decided to revert my changes. I will back to this on Phase 2.
- If it's placed in the
cloud_tests.sh
it doesn't work because themetricbeat-pipeline
step requires the defined MODULE variable. - If it's placed in the
pre-command
hook it requires huge structure changes or duplicating the code.
In my opinion, we don't have enough time for this now and we have to focus on the main functionality. Whenpipeline generator
is ready - we'll think about optimization and logic again.
@v1v @dliappis WDYT?
|
||
cleanup() { | ||
echo "---Terraform Cleanup" | ||
.ci/scripts/terraform-cleanup.sh "${MODULE_DIR}" #TODO: move all docker-compose files from the .ci to .buildkite folder before switching to BK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both of these functions will be run only when it's needed, for example: https://github.com/sharbuz/beats/blob/25547b4ef8c75e9c356573b39e0bc25c0bf5e022/.buildkite/scripts/cloud_tests.sh#L7
It will run when the script finishes with EXIT code, which means when it finishes successfully/unsuccessfully
.buildkite/scripts/cloud_tests.sh
Outdated
trap 'teardown; unset_secrets' EXIT | ||
|
||
# Set the MODULE env variable if possible | ||
defineModuleFromTheChangeSet "${MODULE_DIR}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, this particular function should use the beats project folder.
directory
in Jenkinsfile is set in
Line 1133 in 3f46222
directory: args.project, |
and
Line 1105 in 3f46222
* - project -> the name of the project that should match with the folder name. |
project
means.
Then directory
is used in
Line 676 in 3f46222
def module = withModule ? getCommonModuleInTheChangeSet(directory) : '' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
export BEATS_AWS_SECRET_KEY | ||
BEATS_AWS_ACCESS_KEY=$(retry 5 vault kv get -field access_key ${AWS_SERVICE_ACCOUNT_SECRET_PATH}) | ||
export BEATS_AWS_ACCESS_KEY | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking a bit more about defineModuleFromTheChangeSet
and how other places could use it. I think MODULE
could be set in the pre-command
hook.
fi | |
# Set the MODULE env variable | |
defineModuleFromTheChangeSet "${BEATS_PROJECT_NAME}" |
Therefore my former suggestion for adding defineModuleFromTheChangeSet
in .buildkite/scripts/cloud_tests.sh
could be deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then, with this approach, the MODULE
env variable won't be set in the other stages. Hence, it will take longer to run the tests.
If that's something to be done in phase2 that's totally fine with me. But PRs will be slow to be tested. Unless the defineModuleFromTheChangeSet "${BEATS_PROJECT_NAME}"
is also called in all those tests using the withModule: true
💔 Build Failed
Failed CI StepsHistory
cc @sharbuz |
💔 Build Failed
Failed CI StepsHistory
cc @sharbuz |
💚 Build Succeeded
History
cc @sharbuz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned, IIUC, the current MODULE env variable won't be set hence ITs will take longer on PRs.
If that's something to be revisited in the future, then LGTM. But if build times are important for phase 1... I'd suggest to do a follow-up to set MODULE
env variable in the required stages.
export BEATS_AWS_SECRET_KEY | ||
BEATS_AWS_ACCESS_KEY=$(retry 5 vault kv get -field access_key ${AWS_SERVICE_ACCOUNT_SECRET_PATH}) | ||
export BEATS_AWS_ACCESS_KEY | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then, with this approach, the MODULE
env variable won't be set in the other stages. Hence, it will take longer to run the tests.
If that's something to be done in phase2 that's totally fine with me. But PRs will be slow to be tested. Unless the defineModuleFromTheChangeSet "${BEATS_PROJECT_NAME}"
is also called in all those tests using the withModule: true
💚 Build Succeeded
History
cc @sharbuz |
💚 Build Succeeded
History
cc @sharbuz |
💚 Build Succeeded
History
cc @sharbuz |
* migrate xpack-metricbeat (cherry picked from commit 843010c)
* migrate xpack-metricbeat (cherry picked from commit 843010c) # Conflicts: # .buildkite/scripts/generate_packetbeat_pipeline.sh
* migrate xpack-metricbeat (cherry picked from commit 843010c)
* migrate xpack-metricbeat (cherry picked from commit 843010c) Co-authored-by: sharbuz <[email protected]>
* migrate xpack-metricbeat (cherry picked from commit 843010c) Co-authored-by: sharbuz <[email protected]>
* migrate xpack-metricbeat (#38081) * migrate xpack-metricbeat (cherry picked from commit 843010c) # Conflicts: # .buildkite/scripts/generate_packetbeat_pipeline.sh * Update generate_packetbeat_pipeline.sh * Update setenv.sh --------- Co-authored-by: sharbuz <[email protected]>
What is the problem this PR solves?
Jenkins->Buildkite pipelines migration
Migrate
x-pack metricbeat
pipelineExample of the pipelines:
https://buildkite.com/elastic/beats-xpack-metricbeat/builds/623
Examples of the affected pipelines:
beats-libbeat
beats-metricbeat
beats-packetbeat
beats-winlogbeat
beats-xpack-libbeat
Related issues
https://github.com/elastic/ingest-dev/issues/1693
https://github.com/elastic/ingest-dev/issues/2993 - related to the mentioned inside the PR issue